Topic:Diabetes Prediction
What is Diabetes Prediction? Diabetes prediction is the process of forecasting the risk of developing diabetes based on health data and other factors.
Papers and Code
Mar 26, 2025
Abstract:Type 1 Diabetes (T1D) management is a complex task due to many variability factors. Artificial Pancreas (AP) systems have alleviated patient burden by automating insulin delivery through advanced control algorithms. However, the effectiveness of these systems depends on accurate modeling of glucose-insulin dynamics, which traditional mathematical models often fail to capture due to their inability to adapt to patient-specific variations. This study introduces a Biological-Informed Recurrent Neural Network (BIRNN) framework to address these limitations. The BIRNN leverages a Gated Recurrent Units (GRU) architecture augmented with physics-informed loss functions that embed physiological constraints, ensuring a balance between predictive accuracy and consistency with biological principles. The framework is validated using the commercial UVA/Padova simulator, outperforming traditional linear models in glucose prediction accuracy and reconstruction of unmeasured states, even under circadian variations in insulin sensitivity. The results demonstrate the potential of BIRNN for personalized glucose regulation and future adaptive control strategies in AP systems.
* Accepted for pubblication in the proceedings of the Engineering
Diabetes Technologies (EDT 2025). 7 pages, 2 figures and 1 table
Via

Mar 20, 2025
Abstract:Artificial intelligence (AI) is expected to revolutionize the practice of medicine. Recent advancements in the field of deep learning have demonstrated success in a variety of clinical tasks: detecting diabetic retinopathy from images, predicting hospital readmissions, aiding in the discovery of new drugs, etc. AI's progress in medicine, however, has led to concerns regarding the potential effects of this technology upon relationships of trust in clinical practice. In this paper, I will argue that there is merit to these concerns, since AI systems can be relied upon, and are capable of reliability, but cannot be trusted, and are not capable of trustworthiness. Insofar as patients are required to rely upon AI systems for their medical decision-making, there is potential for this to produce a deficit of trust in relationships in clinical practice.
* 2020. Journal of Medical Ethics 46(7): 478-481
* 12 pages
Via

Mar 15, 2025
Abstract:Coronary artery disease remains one of the leading causes of mortality globally. Despite advances in revascularization treatments like PCI and CABG, postoperative stroke is inevitable. This study aims to develop and evaluate a sophisticated machine learning prediction model to assess postoperative stroke risk in coronary revascularization patients.This research employed data from the MIMIC-IV database, consisting of a cohort of 7023 individuals. Study data included clinical, laboratory, and comorbidity variables. To reduce multicollinearity, variables with over 30% missing values and features with a correlation coefficient larger than 0.9 were deleted. The dataset has 70% training and 30% test. The Random Forest technique interpolated residual dataset missing values. Numerical values were normalized, whereas categorical variables were one-hot encoded. LASSO regularization selected features, and grid search found model hyperparameters. Finally, Logistic Regression, XGBoost, SVM, and CatBoost were employed for predictive modeling, and SHAP analysis assessed stroke risk for each variable. AUC of 0.855 (0.829-0.878) showed that SVM model outperformed logistic regression and CatBoost models in prior research. SHAP research showed that the Charlson Comorbidity Index (CCI), diabetes, chronic kidney disease, and heart failure are significant prognostic factors for postoperative stroke. This study shows that improved machine learning reduces overfitting and improves model predictive accuracy. Models using the CCI alone cannot predict postoperative stroke risk as accurately as those using independent comorbidity variables. The suggested technique provides a more thorough and individualized risk assessment by encompassing a wider range of clinically relevant characteristics, making it a better reference for preoperative risk assessments and targeted intervention.
* 19 pages, 7 figures, submitted to PLOS One. The study employs machine
learning techniques, particularly Support Vector Machines, to predict
postoperative stroke risk in coronary artery disease patients undergoing
revascularization. It utilizes the MIMIC-IV v3.1 database and incorporates
SHapley Additive Properties analysis for model interpretation
Via

Mar 18, 2025
Abstract:Diabetic retinopathy is a leading cause of blindness in diabetic patients and early detection plays a crucial role in preventing vision loss. Traditional diagnostic methods are often time-consuming and prone to errors. The emergence of deep learning techniques has provided innovative solutions to improve diagnostic efficiency. However, single deep learning models frequently face issues related to extracting key features from complex retinal images. To handle this problem, we present an effective ensemble method for DR diagnosis comprising four main phases: image pre-processing, selection of backbone pre-trained models, feature enhancement, and optimization. Our methodology initiates with the pre-processing phase, where we apply CLAHE to enhance image contrast and Gamma correction is then used to adjust the brightness for better feature recognition. We then apply Discrete Wavelet Transform (DWT) for image fusion by combining multi-resolution details to create a richer dataset. Then, we selected three pre-trained models with the best performance named DenseNet169, MobileNetV1, and Xception for diverse feature extraction. To further improve feature extraction, an improved residual block is integrated into each model. Finally, the predictions from these base models are then aggregated using weighted ensemble approach, with the weights optimized by using Salp Swarm Algorithm (SSA).SSA intelligently explores the weight space and finds the optimal configuration of base architectures to maximize the performance of the ensemble model. The proposed model is evaluated on the multiclass Kaggle APTOS 2019 dataset and obtained 88.52% accuracy.
Via

Mar 06, 2025
Abstract:Diabetes is a prevalent chronic disease with significant health and economic burdens worldwide. Early prediction and diagnosis can aid in effective management and prevention of complications. This study explores the use of machine learning models to predict diabetes based on lifestyle factors using data from the Behavioral Risk Factor Surveillance System (BRFSS) 2015 survey. The dataset consists of 21 lifestyle and health-related features, capturing aspects such as physical activity, diet, mental health, and socioeconomic status. Three classification models, Decision Tree, K-Nearest Neighbors (KNN), and Logistic Regression, are implemented and evaluated to determine their predictive performance. The models are trained and tested using a balanced dataset, and their performances are assessed based on accuracy, precision, recall, and F1-score. The results indicate that the Decision Tree, KNN, and Logistic Regression achieve an accuracy of 0.74, 0.72, and 0.75, respectively, with varying strengths in precision and recall. The findings highlight the potential of machine learning in diabetes prediction and suggest future improvements through feature selection and ensemble learning techniques.
* 5 pages, 2 figures, submitted CSCSU 2025
Via

Mar 07, 2025
Abstract:Insulin resistance (IR) is a key precursor to diabetes and a significant risk factor for cardiovascular disease. Traditional IR assessment methods require multiple blood tests. We developed a simple AI model using only fasting blood glucose to predict IR in non-diabetic populations. Data from the NHANES (1999-2020) and CHARLS (2015) studies were used for model training and validation. Input features included age, gender, height, weight, blood pressure, waist circumference, and fasting blood glucose. The CatBoost algorithm achieved AUC values of 0.8596 (HOMA-IR) and 0.7777 (TyG index) in NHANES, with an external AUC of 0.7442 for TyG. For METS-IR prediction, the model achieved AUC values of 0.9731 (internal) and 0.9591 (external), with RMSE values of 3.2643 (internal) and 3.057 (external). SHAP analysis highlighted waist circumference as a key predictor of IR. This AI model offers a minimally invasive and effective tool for IR prediction, supporting early diabetes and cardiovascular disease prevention.
* 20 pages, 8 figures
Via

Mar 12, 2025
Abstract:Accurate staging of Diabetic Retinopathy (DR) is essential for guiding timely interventions and preventing vision loss. However, current staging models are hardly interpretable, and most public datasets contain no clinical reasoning or interpretation beyond image-level labels. In this paper, we present a novel method that integrates graph representation learning with vision-language models (VLMs) to deliver explainable DR diagnosis. Our approach leverages optical coherence tomography angiography (OCTA) images by constructing biologically informed graphs that encode key retinal vascular features such as vessel morphology and spatial connectivity. A graph neural network (GNN) then performs DR staging while integrated gradients highlight critical nodes and edges and their individual features that drive the classification decisions. We collect this graph-based knowledge which attributes the model's prediction to physiological structures and their characteristics. We then transform it into textual descriptions for VLMs. We perform instruction-tuning with these textual descriptions and the corresponding image to train a student VLM. This final agent can classify the disease and explain its decision in a human interpretable way solely based on a single image input. Experimental evaluations on both proprietary and public datasets demonstrate that our method not only improves classification accuracy but also offers more clinically interpretable results. An expert study further demonstrates that our method provides more accurate diagnostic explanations and paves the way for precise localization of pathologies in OCTA images.
* 11 pages, 3 figures
Via

Mar 10, 2025
Abstract:In recent years, large language models (LLMs) have demonstrated remarkable potential across various medical applications. Building on this foundation, multimodal large language models (MLLMs) integrate LLMs with visual models to process diverse inputs, including clinical data and medical images. In ophthalmology, LLMs have been explored for analyzing optical coherence tomography (OCT) reports, assisting in disease classification, and even predicting treatment outcomes. However, existing MLLM benchmarks often fail to capture the complexities of real-world clinical practice, particularly in the analysis of OCT images. Many suffer from limitations such as small sample sizes, a lack of diverse OCT datasets, and insufficient expert validation. These shortcomings hinder the accurate assessment of MLLMs' ability to interpret OCT scans and their broader applicability in ophthalmology. Our dataset, curated through rigorous quality control and expert annotation, consists of 439 fundus images and 75 OCT images. Using a standardized API-based framework, we assessed seven mainstream MLLMs and observed significant variability in diagnostic accuracy across different diseases. While some models performed well in diagnosing conditions such as diabetic retinopathy and age-related macular degeneration, they struggled with others, including choroidal neovascularization and myopia, highlighting inconsistencies in performance and the need for further refinement. Our findings emphasize the importance of developing clinically relevant benchmarks to provide a more accurate assessment of MLLMs' capabilities. By refining these models and expanding their scope, we can enhance their potential to transform ophthalmic diagnosis and treatment.
Via

Mar 05, 2025
Abstract:Postprandial hyperglycemia, marked by the blood glucose level exceeding the normal range after meals, is a critical indicator of progression toward type 2 diabetes in prediabetic and healthy individuals. A key metric for understanding blood glucose dynamics after eating is the postprandial area under the curve (PAUC). Predicting PAUC in advance based on a person's diet and activity level and explaining what affects postprandial blood glucose could allow an individual to adjust their lifestyle accordingly to maintain normal glucose levels. In this paper, we propose GlucoLens, an explainable machine learning approach to predict PAUC and hyperglycemia from diet, activity, and recent glucose patterns. We conducted a five-week user study with 10 full-time working individuals to develop and evaluate the computational model. Our machine learning model takes multimodal data including fasting glucose, recent glucose, recent activity, and macronutrient amounts, and provides an interpretable prediction of the postprandial glucose pattern. Our extensive analyses of the collected data revealed that the trained model achieves a normalized root mean squared error (NRMSE) of 0.123. On average, GlucoLense with a Random Forest backbone provides a 16% better result than the baseline models. Additionally, GlucoLens predicts hyperglycemia with an accuracy of 74% and recommends different options to help avoid hyperglycemia through diverse counterfactual explanations. Code available: https://github.com/ab9mamun/GlucoLens.
* 8 pages, 5 figures
Via

Mar 07, 2025
Abstract:The growing rate of chronic wound occurrence, especially in patients with diabetes, has become a concerning trend in recent years. Chronic wounds are difficult and costly to treat, and have become a serious burden on health care systems worldwide. Chronic wounds can have devastating consequences for the patient, with infection often leading to reduced quality of life and increased mortality risk. Innovative deep learning methods for the detection and monitoring of such wounds have the potential to reduce the impact to both patient and clinician. We present a novel multimodal segmentation method which allows for the introduction of patient metadata into the training workflow whereby the patient data are expressed as Gaussian random fields. Our results indicate that the proposed method improved performance when utilising multiple models, each trained on different metadata categories. Using the Diabetic Foot Ulcer Challenge 2022 test set, when compared to the baseline results (intersection over union = 0.4670, Dice similarity coefficient = 0.5908) we demonstrate improvements of +0.0220 and +0.0229 for intersection over union and Dice similarity coefficient respectively. This paper presents the first study to focus on integrating patient data into a chronic wound segmentation workflow. Our results show significant performance gains when training individual models using specific metadata categories, followed by average merging of prediction masks using distance transforms. All source code for this study is available at: https://github.com/mmu-dermatology-research/multimodal-grf
Via
